Replace a Node in a Multi-node Deployment Using a Single CLI Command

You can use the upgrade process to replace a faulty node in a multi-node deployment.

Before you begin

  • Ensure the cluster with the faulty node is running EFA 2.5.5 or later.
  • Ensure you have completed the high-availability prerequisites in High-Availability Requirements.
  • Ensure that XCO is not deployed on the replacement node.
  • Ensure that the faulty node is shut down.
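
The last two prerequisites can be partly checked from the active node before you start. The following is a minimal sketch, not part of the product: the function names `is_reachable` and `preflight` are hypothetical, and it assumes a Linux `ping` (where `-W` is a timeout in seconds), that ICMP is not filtered between the nodes, and that the replacement node uses a different IP address from the faulty node. A silent ping only suggests that the faulty node is down; it does not prove it.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper; not an XCO or EFA command.

# Returns 0 if the host answers a single ICMP echo within 2 seconds.
is_reachable() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Succeeds only when the faulty node is silent and the replacement node answers.
# PROBE may be set to another function or command; it defaults to is_reachable.
preflight() {
    local faulty_ip="$1" replacement_ip="$2" probe="${PROBE:-is_reachable}"
    if "$probe" "$faulty_ip"; then
        echo "Faulty node $faulty_ip still responds; shut it down first." >&2
        return 1
    fi
    if ! "$probe" "$replacement_ip"; then
        echo "Replacement node $replacement_ip is unreachable." >&2
        return 1
    fi
    echo "Pre-flight checks passed."
}
```

Call it as `preflight <faulty-node-ip> <replacement-node-ip>`, substituting the addresses for your own deployment.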

About this task

During this process, the faulty node is decommissioned, the replacement node is provisioned, and the active node is reconfigured to form the cluster.

Perform this procedure on the active node where XCO is installed.

Procedure

  1. Navigate to the directory where the XCO file (*.tar.gz) is untarred.
  2. Run the deployment script.
    $ source deployment.sh -I --deploy-suite fabric --deploy-type multi-node --deployipmode ipv4 --virtual-ipv4 10.32.85.119 --replacement-ip 10.20.48.101
    Checking for EFA Stack...
    Deployment mode is upgrade
    Verifying connectivity to 10.32.85.114...
    10.20.48.101 server is reachable...
    You have entered:
    - to redeploy EFA at version 3.4.0 build 12
    - with peer 10.20.48.101
    - and with VIP 10.32.85.119
    - with node replacement
    - with IP Stack ipv4
    - with suites: Fabric Automation
    Verifying if monitor service is running on 10.32.85.111 10.20.48.101...
    Checking system configuration on 10.32.85.111 10.20.48.101...
    Ensuring machine clocks are in sync
    Verifying clocks are approximately in sync
    Checking default gateway reachability on all nodes...
    Completed verification of default gateway reachability on all nodes
    Ensuring peer hostnames are unique
    Verifying unique hostname between nodes
    Hostnames are unique 
    Ensuring compatible OS version
    Verifying Operating System between nodes
    Operating system of all nodes are same
    Making backup
    Removing legacy EFA installation
    Stopping EFA services
    Undeploying EFA application...
    Undeploying ecosystem services
    Undeploying core services
    Removed current application deployment successfully.
    Removing EFA container images
    Removing container images on 10.32.85.111 10.20.48.101...
    Removing EFA OS services
    Removing k3s container orchestration
    Removing database
    Unholding mariadb server
    Removing Database Server
    Unholding mariadb client
    Removing Database Client
    Removing cluster filesystem
    Unholding glusterfs
    Removing keepalived for cluster virtual IP
    Removing database sync tools
    Removing EFA services and utilities
    Proceeding with new EFA installation
    Verifying system requirements
    Verifying system requirements on all nodes......

    The node replacement proceeds. Messages indicate the progress and when the replacement is complete.

  3. Verify the status of XCO after the node replacement. Ensure that all nodes are up.
    $ efa status
    For more information on how to recover SLX configs, refer to the ExtremeCloud Orchestrator CLI Administration Guide, 3.6.0.
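
Services can take some time to come up after the replacement, so the status check in step 3 may need to be repeated. The following is a minimal sketch of a generic retry wrapper. The function name `retry_until_ok` is hypothetical, and the usage shown in the final comment assumes that `efa status` returns a non-zero exit code while nodes are still down, which this procedure does not confirm. If that assumption does not hold, inspect the command output instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it exits 0 or attempts run out.
# Usage: retry_until_ok <attempts> <delay_seconds> <command> [args...]
retry_until_ok() {
    local attempts="$1" delay="$2" i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        # Sleep only between attempts, not after the last one.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (assumption: efa status exits non-zero until all nodes are up):
# retry_until_ok 10 30 efa status
```

The wrapper takes the command as arguments rather than hard-coding it, so the same function can be reused for any other check that reports its result through an exit code.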